Merging Smart-seq2 and 10X Datasets
We have quality-controlled the 10X data and the SS2 data and now are left with the following objects:
10X 5K data - pb_sex_filtered
10X 30K data - pb_30k_sex_filtered
SS2 mutant data - ss2_mutants_final
[1] "patchwork is loaded correctly"
[1] "viridis is loaded correctly"
[1] "Seurat is loaded correctly"
[1] "cowplot is loaded correctly"
[1] "gridExtra is loaded correctly"
[1] "grid is loaded correctly"
[1] "Hmisc is loaded correctly"
[1] "reshape2 is loaded correctly"
[1] "dplyr is loaded correctly"
=======
[1] "Seurat is loaded correctly"
[1] "cowplot is loaded correctly"
[1] "gridExtra is loaded correctly"
Loading required package: grid
[1] "grid is loaded correctly"
[1] "Hmisc is loaded correctly"
[1] "dplyr is loaded correctly"
Loading required package: monocle3
Loading required package: Biobase
Loading required package: BiocGenerics
Loading required package: parallel
Attaching package: ‘BiocGenerics’
The following objects are masked from ‘package:parallel’:
clusterApply, clusterApplyLB, clusterCall, clusterEvalQ, clusterExport,
clusterMap, parApply, parCapply, parLapply, parLapplyLB, parRapply, parSapply,
parSapplyLB
The following object is masked from ‘package:gridExtra’:
combine
The following objects are masked from ‘package:dplyr’:
combine, intersect, setdiff, union
The following objects are masked from ‘package:stats’:
IQR, mad, sd, var, xtabs
The following objects are masked from ‘package:base’:
anyDuplicated, append, as.data.frame, basename, cbind, colnames, dirname,
do.call, duplicated, eval, evalq, Filter, Find, get, grep, grepl, intersect,
is.unsorted, lapply, Map, mapply, match, mget, order, paste, pmax, pmax.int,
pmin, pmin.int, Position, rank, rbind, Reduce, rownames, sapply, setdiff, sort,
table, tapply, union, unique, unsplit, which, which.max, which.min
Welcome to Bioconductor
Vignettes contain introductory material; view with 'browseVignettes()'. To cite
Bioconductor, see 'citation("Biobase")', and for packages 'citation("pkgname")'.
Attaching package: ‘Biobase’
The following object is masked from ‘package:Hmisc’:
contents
Loading required package: SingleCellExperiment
Loading required package: SummarizedExperiment
Loading required package: GenomicRanges
Loading required package: stats4
Loading required package: S4Vectors
Attaching package: ‘S4Vectors’
The following objects are masked from ‘package:dplyr’:
first, rename
The following object is masked from ‘package:base’:
expand.grid
Loading required package: IRanges
Attaching package: ‘IRanges’
The following objects are masked from ‘package:dplyr’:
collapse, desc, slice
Loading required package: GenomeInfoDb
Loading required package: DelayedArray
Loading required package: matrixStats
Attaching package: ‘matrixStats’
The following objects are masked from ‘package:Biobase’:
anyMissing, rowMedians
The following object is masked from ‘package:dplyr’:
count
Attaching package: ‘DelayedArray’
The following objects are masked from ‘package:matrixStats’:
colMaxs, colMins, colRanges, rowMaxs, rowMins, rowRanges
The following objects are masked from ‘package:base’:
aperm, apply, rowsum
Attaching package: ‘SummarizedExperiment’
The following object is masked from ‘package:Seurat’:
Assays
Attaching package: ‘monocle3’
The following objects are masked from ‘package:Biobase’:
exprs, fData, fData<-, pData, pData<-
[1] "monocle3 is loaded correctly"
[1] "reshape2 is loaded correctly"
>>>>>>> directory_change
screen hits
## EDIT - change this to the excel table once we have it finalised for the screen
screen_hits <- c("PBANKA-0516300",
"PBANKA-1217700",
"PBANKA-0409100",
"PBANKA-1034300",
"PBANKA-1437500",
"PBANKA-0827500",
"PBANKA-0824300",
"PBANKA-1426900",
"PBANKA-0105300",
"PBANKA-0921100",
"PBANKA-1002400",
"PBANKA-0829400",
"PBANKA-1347200",
"PBANKA-0828000",
"PBANKA-0902300",
"PBANKA-1418100",
"PBANKA-1435200",
"PBANKA-1454800",
"PBANKA-0712300",
"PBANKA-0410500",
"PBANKA-1144800",
"PBANKA-1231600",
"PBANKA-0503200",
"PBANKA-0308900",
"PBANKA-1214700",
"PBANKA-0709900",
"PBANKA-0311900",
"PBANKA-0716500",
"PBANKA-1447900",
"PBANKA-0102200",
"PBANKA-0713500",
"PBANKA-0102400",
"PBANKA-1302700",
"PBANKA-1235900",
"PBANKA-0401100",
"PBANKA-0413400",
"PBANKA-1126900",
"PBANKA-1425900",
"PBANKA-0418300",
"PBANKA-1464600",
"PBANKA-0806000")
load in datasets
<<<<<<< HEAD## load the 30k 10X dataset
#pb_30k_sex_filtered <- readRDS("pb_30k_sex_filtered.RDS")
## load the 10X dataset
pb_sex_filtered <- readRDS("/Users/Andy/pb_sex_filtered.RDS")
#pb_sex_filtered <- readRDS("/Users/Andy/GCSKO/GCSKO_analysis_git/data_to_export/pb_sex_filtered.RDS")
## load the SS2 dataset
ss2_mutants_final <- readRDS("/Users/Andy/ss2_mutants_final.RDS")
## inspect
paste("10x dataset")
pb_sex_filtered
paste("Smart-seq2 dataset")
ss2_mutants_final
=======
## load the 30k 10X dataset
#pb_30k_sex_filtered <- readRDS("pb_30k_sex_filtered.RDS")
## load the 10X dataset
pb_sex_filtered <- readRDS("/Users/Andy/pb_sex_filtered.RDS")
cannot open compressed file '/Users/Andy/pb_sex_filtered.RDS', probable reason 'No such file or directory'Error in gzfile(file, "rb") : cannot open the connection
>>>>>>> directory_change
## extract 10x data
tenx_5k_counts <- as.matrix(pb_sex_filtered@assays$RNA@counts)
tenx_5k_pheno <- pb_sex_filtered@meta.data
## Create fresh object
tenx_5k_counts_to_integrate <- CreateSeuratObject(counts = tenx_5k_counts, meta.data = tenx_5k_pheno, min.cells = 0, min.features = 0, project = "GCSKO")
## add experiment meta data
tenx_5k_counts_to_integrate@meta.data$experiment <- "tenx_5k"
## inspect
tenx_5k_counts_to_integrate
=======
>>>>>>> directory_change
We need to make sure the mutant data is compatible with the 10X data. the 10X data has fewer genes represented so we need to find the intersect of the two before integration.
<<<<<<< HEAD## extract SS2 data
mutant_counts_for_integration <- as.matrix(ss2_mutants_final@assays$RNA@counts)
mutant_pheno_for_integration <- ss2_mutants_final@meta.data
## change counts so the :rRNA and :tRNA are not there:
rownames(mutant_counts_for_integration) <- gsub(":ncRNA", "", gsub(":rRNA", "", gsub(":tRNA", "", rownames(mutant_counts_for_integration))))
## change the gene names so that they are - rather than _:
rownames(mutant_counts_for_integration) <- gsub("_", "-", rownames(mutant_counts_for_integration))
## calculate how many of the genes overlap - 10x does start out with 5098 vs 5245
genes_in_tenx_dataset <- intersect(rownames(tenx_5k_counts), rownames(mutant_counts_for_integration))
## print number of genes that overlap
dim(mutant_counts_for_integration)
## subset the mutant counts to contain only 10x genes
mutant_counts_for_integration <- mutant_counts_for_integration[which(rownames(mutant_counts_for_integration) %in% genes_in_tenx_dataset), ]
## print result of genes that overlap
dim(mutant_counts_for_integration)
## make Seurat object:
GCSKO_mutants <- CreateSeuratObject(counts = mutant_counts_for_integration, meta.data = mutant_pheno_for_integration, min.cells = 0, min.features = 0, project = "GCSKO")
## add experiment meta data
GCSKO_mutants@meta.data$experiment <- "mutants"
## inspect
GCSKO_mutants
## extract SS2 data
mutant_counts_for_integration <- as.matrix(ss2_mutants_final@assays$RNA@counts)
mutant_pheno_for_integration <- ss2_mutants_final@meta.data
## change counts so the :rRNA and :tRNA are not there:
rownames(mutant_counts_for_integration) <- gsub(":ncRNA", "", gsub(":rRNA", "", gsub(":tRNA", "", rownames(mutant_counts_for_integration))))
## change the gene names so that they are - rather than _:
rownames(mutant_counts_for_integration) <- gsub("_", "-", rownames(mutant_counts_for_integration))
## calculate how many of the genes overlap - 10x does start out with 5098 vs 5245
genes_in_tenx_dataset <- intersect(rownames(tenx_5k_counts), rownames(mutant_counts_for_integration))
## print number of genes that overlap
dim(mutant_counts_for_integration)
[1] 5245 3031
## subset the mutant counts to contain only 10x genes
mutant_counts_for_integration <- mutant_counts_for_integration[which(rownames(mutant_counts_for_integration) %in% genes_in_tenx_dataset), ]
## print result of genes that overlap
dim(mutant_counts_for_integration)
[1] 5018 3031
## make Seurat object:
GCSKO_mutants <- CreateSeuratObject(counts = mutant_counts_for_integration, meta.data = mutant_pheno_for_integration, min.cells = 0, min.features = 0, project = "GCSKO")
## add experiment meta data
GCSKO_mutants@meta.data$experiment <- "mutants"
## inspect
GCSKO_mutants
An object of class Seurat
5018 features across 3031 samples within 1 assay
Active assay: RNA (5018 features, 0 variable features)
=======
>>>>>>> directory_change
create list and normalise:
## Find anchors
tenx.mutant.anchors <- FindIntegrationAnchors(object.list = tenx.mutant.list, dims = 1:21, verbose = FALSE)
## Integrate data
tenx.mutant.integrated <- IntegrateData(anchorset = tenx.mutant.anchors, dims = 1:21, verbose = FALSE, features.to.integrate = genes_in_tenx_dataset)
=======
>>>>>>> directory_change
# Make the default assay integrated
DefaultAssay(tenx.mutant.integrated) <- "integrated"
# Run the standard workflow for visualization and clustering
tenx.mutant.integrated <- ScaleData(tenx.mutant.integrated, verbose = FALSE)
tenx.mutant.integrated <- RunPCA(tenx.mutant.integrated, npcs = 30, verbose = FALSE)
## inspect PCs
ElbowPlot(tenx.mutant.integrated, ndims = 30, reduction = "pca")
=======
>>>>>>> directory_change
Run inital UMAP
<<<<<<< HEAD## Run UMAP
tenx.mutant.integrated <- RunUMAP(tenx.mutant.integrated, reduction = "pca", dims = 1:8, n.neighbors = 50, seed.use = 1234, min.dist = 0.5, repulsion.strength = 0.05)
See distribution by: altogether, experiment, and mutant ID
## Run UMAP
tenx.mutant.integrated <- RunUMAP(tenx.mutant.integrated, reduction = "pca", dims = 1:8, n.neighbors = 50, seed.use = 1234, min.dist = 0.5, repulsion.strength = 0.05)
The default method for RunUMAP has changed from calling Python UMAP via reticulate to the R-native UWOT using the cosine metric
To use Python UMAP via reticulate, set umap.method to 'umap-learn' and metric to 'correlation'
This message will be shown once per session13:28:30 UMAP embedding parameters a = 0.583 b = 1.334
Found more than one class "dist" in cache; using the first, from namespace 'spam'
Also defined by ‘BiocGenerics’
13:28:30 Read 9222 rows and found 8 numeric columns
13:28:30 Using Annoy for neighbor search, n_neighbors = 50
Found more than one class "dist" in cache; using the first, from namespace 'spam'
Also defined by ‘BiocGenerics’
13:28:30 Building Annoy index with metric = cosine, n_trees = 50
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
13:28:33 Writing NN index file to temp file /var/folders/wj/rztzclxn1t10cl2sk0plbf3r0000gn/T//Rtmp4lB433/file154091d65f158
13:28:34 Searching Annoy index using 1 thread, search_k = 5000
13:28:41 Annoy recall = 100%
13:28:42 Commencing smooth kNN distance calibration using 1 thread
13:28:45 Initializing from normalized Laplacian + noise
13:28:48 Commencing optimization for 500 epochs, with 604438 positive edges
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
13:29:08 Optimization finished
=======
>>>>>>> directory_change
After optimisation, the following UMAP can be calculated:
Now store these reversed embeddings in a new slot
<<<<<<< HEAD## plot
dp1 <- DimPlot(tenx.mutant.integrated, label = TRUE, repel = FALSE, pt.size = 0.05, dims = c(2,1)) +
## fix the axis
coord_fixed() +
## reverse the scale
scale_x_reverse()
## view
dp1
Now store these reversed embeddings in a new slot
>>>>>>> directory_changeRecluster dataset now that it is integrated. We will cluster with a number of resolutions to begin with to see how this affects the number and nature of the clusters.
<<<<<<< HEAD## copy old clusters
tenx.mutant.integrated <- AddMetaData(tenx.mutant.integrated, tenx.mutant.integrated@meta.data$RNA_snn_res.1, col.name = "pre_integration_clusters")
## generate new clusters at low resolution
## 1
tenx.mutant.integrated <- FindNeighbors(tenx.mutant.integrated, dims = 1:15)
tenx.mutant.integrated <- FindClusters(tenx.mutant.integrated, resolution = 1, random.seed = 42, algorithm = 2)
## generate new clusters at low resolution
## 1.2
tenx.mutant.integrated <- FindNeighbors(tenx.mutant.integrated, dims = 1:15)
tenx.mutant.integrated <- FindClusters(tenx.mutant.integrated, resolution = 1.2, random.seed = 42, algorithm = 2)
## generate new clusters at low resolution
## 1.5
tenx.mutant.integrated <- FindNeighbors(tenx.mutant.integrated, dims = 1:15)
tenx.mutant.integrated <- FindClusters(tenx.mutant.integrated, resolution = 1.5, random.seed = 42, algorithm = 2)
## generate new clusters at mid resolution
## 2
tenx.mutant.integrated <- FindNeighbors(tenx.mutant.integrated, dims = 1:15)
tenx.mutant.integrated <- FindClusters(tenx.mutant.integrated, resolution = 2, random.seed = 42, algorithm = 2)
## generate new clusters at high resolution
## 4
tenx.mutant.integrated <- FindNeighbors(tenx.mutant.integrated, dims = 1:15)
tenx.mutant.integrated <- FindClusters(tenx.mutant.integrated, resolution = 4, random.seed = 42, algorithm = 2)
## print identities
#head(Idents(tenx.mutant.integrated), 10)
=======
>>>>>>> directory_change
View
<<<<<<< HEAD## Plot
DimPlot(tenx.mutant.integrated, label = TRUE, repel = TRUE, pt.size = 0.05, dims = c(2,1), reduction = "DIM_UMAP", group.by = "integrated_snn_res.1") + coord_fixed()
=======
>>>>>>> directory_change
Make individual plots highlighting where cells in each cluster fall
plot
<<<<<<< HEAD## for loop which takes each cluster and makes a list of cells and then plots a highlighted plot and adds it to a list
## make a blank list
list_UMAPs_by_cluster <- vector(mode = "list", length = length(levels(tenx.mutant.integrated@meta.data$integrated_snn_res.1)))
## for loop
for(i in seq_along(levels(tenx.mutant.integrated@meta.data$integrated_snn_res.1))){
## make a list of cells
list_of_cells <- rownames(tenx.mutant.integrated@meta.data[which(tenx.mutant.integrated@meta.data$integrated_snn_res.1 == levels(tenx.mutant.integrated@meta.data$integrated_snn_res.1)[i]), ])
uamp_plot <- DimPlot(tenx.mutant.integrated, label = FALSE, repel = TRUE, pt.size = 0.1, cells.highlight = list_of_cells, dims = c(2,1), reduction = "DIM_UMAP") +
## fix coordinates
coord_fixed() +
scale_color_manual(values=c("#000000", "#f54e1e")) +
theme_void() +
labs(title = paste("cluster", levels(tenx.mutant.integrated@meta.data$integrated_snn_res.1)[i])) +
theme(plot.title = element_text(hjust = 0.5), legend.position = "none")
## add to the list
list_UMAPs_by_cluster[[i]] <- uamp_plot
}
## check number of clusters
#length(list_UMAPs_by_cluster)
=======
>>>>>>> directory_change
View
<<<<<<< HEAD## this function writes the next bit of code for you
## put it into the console and paste the response
#ploty <- c()
#for(i in seq_along(levels(tenx.mutant.integrated@meta.data$seurat_clusters))){
# ploty <- paste0(ploty, "list_UMAPs_by_cluster[[", i, "]]", " + ")
#}
## plot
list_UMAPs_by_cluster[[1]] + list_UMAPs_by_cluster[[2]] + list_UMAPs_by_cluster[[3]] + list_UMAPs_by_cluster[[4]] + list_UMAPs_by_cluster[[5]] + list_UMAPs_by_cluster[[6]] + list_UMAPs_by_cluster[[7]] + list_UMAPs_by_cluster[[8]] + list_UMAPs_by_cluster[[9]] + list_UMAPs_by_cluster[[10]] + list_UMAPs_by_cluster[[11]] + list_UMAPs_by_cluster[[12]] + list_UMAPs_by_cluster[[13]] + list_UMAPs_by_cluster[[14]] + list_UMAPs_by_cluster[[15]] + list_UMAPs_by_cluster[[16]] + list_UMAPs_by_cluster[[17]] + list_UMAPs_by_cluster[[18]] + list_UMAPs_by_cluster[[19]] + list_UMAPs_by_cluster[[20]] + list_UMAPs_by_cluster[[21]] + list_UMAPs_by_cluster[[22]]
Make individual plots highlighting where cells in each cluster fall
plot
<<<<<<< HEAD## for loop which takes each cluster and makes a list of cells and then plots a highlighted plot and adds it to a list
## make a blank list
list_UMAPs_by_cluster <- vector(mode = "list", length = length(levels(tenx.mutant.integrated@meta.data$integrated_snn_res.1.2)))
## for loop
for(i in seq_along(levels(tenx.mutant.integrated@meta.data$integrated_snn_res.1.2))){
## make a list of cells
list_of_cells <- rownames(tenx.mutant.integrated@meta.data[which(tenx.mutant.integrated@meta.data$integrated_snn_res.1.2 == levels(tenx.mutant.integrated@meta.data$integrated_snn_res.1.2)[i]), ])
uamp_plot <- DimPlot(tenx.mutant.integrated, label = FALSE, repel = TRUE, pt.size = 0.1, cells.highlight = list_of_cells, dims = c(2,1), reduction = "DIM_UMAP") +
## fix coordinates
coord_fixed() +
scale_color_manual(values=c("#000000", "#f54e1e")) +
theme_void() +
labs(title = paste("cluster", levels(tenx.mutant.integrated@meta.data$integrated_snn_res.1.2)[i])) +
theme(plot.title = element_text(hjust = 0.5), legend.position = "none")
## add to the list
list_UMAPs_by_cluster[[i]] <- uamp_plot
}
## this function writes the next bit of code for you
## put it into the console and paste the response
#ploty <- c()
#for(i in seq_along(levels(tenx.mutant.integrated@meta.data$seurat_clusters))){
# ploty <- paste0(ploty, "list_UMAPs_by_cluster[[", i, "]]", " + ")
#}
## plot
list_UMAPs_by_cluster[[1]] + list_UMAPs_by_cluster[[2]] + list_UMAPs_by_cluster[[3]] + list_UMAPs_by_cluster[[4]] + list_UMAPs_by_cluster[[5]] + list_UMAPs_by_cluster[[6]] + list_UMAPs_by_cluster[[7]] + list_UMAPs_by_cluster[[8]] + list_UMAPs_by_cluster[[9]] + list_UMAPs_by_cluster[[10]] + list_UMAPs_by_cluster[[11]] + list_UMAPs_by_cluster[[12]] + list_UMAPs_by_cluster[[13]] + list_UMAPs_by_cluster[[14]] + list_UMAPs_by_cluster[[15]] + list_UMAPs_by_cluster[[16]] + list_UMAPs_by_cluster[[17]] + list_UMAPs_by_cluster[[18]] + list_UMAPs_by_cluster[[19]] + list_UMAPs_by_cluster[[20]] + list_UMAPs_by_cluster[[21]] + list_UMAPs_by_cluster[[22]] + list_UMAPs_by_cluster[[23]] + list_UMAPs_by_cluster[[24]]
Make individual plots highlighting where cells in each cluster fall
<<<<<<< HEAD## 1.5 resolution
list_UMAPs_by_cluster[[1]] + list_UMAPs_by_cluster[[2]] + list_UMAPs_by_cluster[[3]] + list_UMAPs_by_cluster[[4]] + list_UMAPs_by_cluster[[5]] + list_UMAPs_by_cluster[[6]] + list_UMAPs_by_cluster[[7]] + list_UMAPs_by_cluster[[8]] + list_UMAPs_by_cluster[[9]] + list_UMAPs_by_cluster[[10]] + list_UMAPs_by_cluster[[11]] + list_UMAPs_by_cluster[[12]] + list_UMAPs_by_cluster[[13]] + list_UMAPs_by_cluster[[14]] + list_UMAPs_by_cluster[[15]] + list_UMAPs_by_cluster[[16]] + list_UMAPs_by_cluster[[17]] + list_UMAPs_by_cluster[[18]] + list_UMAPs_by_cluster[[19]] + list_UMAPs_by_cluster[[20]] + list_UMAPs_by_cluster[[21]] + list_UMAPs_by_cluster[[22]] + list_UMAPs_by_cluster[[23]] + list_UMAPs_by_cluster[[24]] + list_UMAPs_by_cluster[[25]] + list_UMAPs_by_cluster[[26]]
View
Make individual plots highlighting where cells in each cluster fall
<<<<<<< HEADplot
## for loop which takes each cluster and makes a list of cells and then plots a highlighted plot and adds it to a list
## make a blank list
list_UMAPs_by_cluster <- vector(mode = "list", length = length(levels(tenx.mutant.integrated@meta.data$integrated_snn_res.2)))
## for loop
for(i in seq_along(levels(tenx.mutant.integrated@meta.data$integrated_snn_res.2))){
## make a list of cells
list_of_cells <- rownames(tenx.mutant.integrated@meta.data[which(tenx.mutant.integrated@meta.data$integrated_snn_res.2 == levels(tenx.mutant.integrated@meta.data$integrated_snn_res.2)[i]), ])
uamp_plot <- DimPlot(tenx.mutant.integrated, label = FALSE, repel = TRUE, pt.size = 0.1, cells.highlight = list_of_cells, dims = c(2,1), reduction = "DIM_UMAP") +
## fix coordinates
coord_fixed() +
scale_color_manual(values=c("#000000", "#f54e1e")) +
theme_void() +
labs(title = paste("cluster", levels(tenx.mutant.integrated@meta.data$integrated_snn_res.2)[i])) +
theme(plot.title = element_text(hjust = 0.5), legend.position = "none")
## add to the list
list_UMAPs_by_cluster[[i]] <- uamp_plot
}
## check number of clusters
#length(list_UMAPs_by_cluster)
View
Make individual plots highlighting where cells in each cluster fall
<<<<<<< HEADplot
## for loop which takes each cluster and makes a list of cells and then plots a highlighted plot and adds it to a list
## make a blank list
list_UMAPs_by_cluster <- vector(mode = "list", length = length(levels(tenx.mutant.integrated@meta.data$integrated_snn_res.4)))
## for loop
for(i in seq_along(levels(tenx.mutant.integrated@meta.data$integrated_snn_res.4))){
## make a list of cells
list_of_cells <- rownames(tenx.mutant.integrated@meta.data[which(tenx.mutant.integrated@meta.data$integrated_snn_res.4 == levels(tenx.mutant.integrated@meta.data$integrated_snn_res.4)[i]), ])
uamp_plot <- DimPlot(tenx.mutant.integrated, label = FALSE, repel = TRUE, pt.size = 0.1, cells.highlight = list_of_cells, dims = c(2,1), reduction = "DIM_UMAP") +
## fix coordinates
coord_fixed() +
scale_color_manual(values=c("#000000", "#f54e1e")) +
theme_void() +
labs(title = paste("cluster", levels(tenx.mutant.integrated@meta.data$integrated_snn_res.4)[i])) +
theme(plot.title = element_text(hjust = 0.5), legend.position = "none")
## add to the list
list_UMAPs_by_cluster[[i]] <- uamp_plot
}
## check number of clusters
#length(list_UMAPs_by_cluster)
16:30:10 UMAP embedding parameters a = 0.9922 b = 1.112
Found more than one class "dist" in cache; using the first, from namespace 'spam'
Also defined by ‘BiocGenerics’
16:30:10 Read 9222 rows and found 10 numeric columns
16:30:10 Using Annoy for neighbor search, n_neighbors = 30
Found more than one class "dist" in cache; using the first, from namespace 'spam'
Also defined by ‘BiocGenerics’
16:30:10 Building Annoy index with metric = cosine, n_trees = 50
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
16:30:13 Writing NN index file to temp file /var/folders/wj/rztzclxn1t10cl2sk0plbf3r0000gn/T//RtmpEC5Ztx/file163304a03eaa1
16:30:13 Searching Annoy index using 1 thread, search_k = 3000
16:30:17 Annoy recall = 100%
16:30:18 Commencing smooth kNN distance calibration using 1 thread
16:30:20 Initializing from normalized Laplacian + noise
16:30:21 Commencing optimization for 500 epochs, with 377596 positive edges
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
16:30:47 Optimization finished
Computing nearest neighbor graph
Computing SNN
Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck
Number of nodes: 9222
Number of edges: 220909
Running Louvain algorithm with multilevel refinement...
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Maximum modularity in 10 random starts: 0.9265
Number of communities: 34
Elapsed time: 1 seconds
=======
>>>>>>> directory_change
plot
<<<<<<< HEAD## for loop which takes each cluster and makes a list of cells and then plots a highlighted plot and adds it to a list
## make a blank list
list_UMAPs_by_cluster <- vector(mode = "list", length = length(levels(tenx.mutant.integrated@meta.data$integrated_snn_res.1)))
## for loop
for(i in seq_along(levels(tenx.mutant.integrated@meta.data$integrated_snn_res.1))){
## make a list of cells
list_of_cells <- rownames(tenx.mutant.integrated@meta.data[which(tenx.mutant.integrated@meta.data$integrated_snn_res.1 == levels(tenx.mutant.integrated@meta.data$integrated_snn_res.1)[i]), ])
uamp_plot <- DimPlot(tenx.mutant.integrated, label = FALSE, repel = TRUE, pt.size = 0.1, cells.highlight = list_of_cells, dims = c(2,1), reduction = "DIM_UMAP") +
## fix coordinates
coord_fixed() +
scale_color_manual(values=c("#000000", "#f54e1e")) +
theme_void() +
labs(title = paste("cluster", levels(tenx.mutant.integrated@meta.data$integrated_snn_res.1)[i])) +
theme(plot.title = element_text(hjust = 0.5), legend.position = "none")
## add to the list
list_UMAPs_by_cluster[[i]] <- uamp_plot
}
## check number of clusters
length(list_UMAPs_by_cluster)
[1] 34
=======
>>>>>>> directory_change
We will look in more detail at cells as they enter the sexual trajecotry later. The PCA clustering will be more appropriate in this high-resolution view. In order to subset these cells, we will use the UMAP clustering.
<<<<<<< HEAD## generate final clusters which will be written into the 'seurat.clusters' slot in meta data
tenx.mutant.integrated <- FindNeighbors(tenx.mutant.integrated, dims = 1:10, reduction = "umap")
tenx.mutant.integrated <- FindClusters(tenx.mutant.integrated, resolution = 1, random.seed = 42, algorithm = 2)
## Plot
DimPlot(tenx.mutant.integrated, label = TRUE, repel = TRUE, pt.size = 0.05, group.by = "seurat_clusters", dims = c(2,1), reduction = "DIM_UMAP") + coord_fixed()
=======
>>>>>>> directory_change
We will get some high level insight into these clusters now
<<<<<<< HEADv1 <- VlnPlot(object = tenx.mutant.integrated, features = "nFeature_RNA", group.by = "seurat_clusters", pt.size = 0.01)
v2 <- VlnPlot(object = tenx.mutant.integrated, features = "nCount_RNA", group.by = "seurat_clusters", pt.size = 0.01)
v1 + v2
=======
>>>>>>> directory_change
We have defined clusters, now we will identify what the clusters correspond to. We can use a number of external datasets to do this:
known marker genes
bulk RNA-seq data correlation
## make plots
plots <- FeaturePlot(tenx.mutant.integrated, features = c("PBANKA-1319500", "PBANKA-0416100"), blend = TRUE, combine = FALSE, coord.fixed = TRUE, dims = c(2,1), reduction = "DIM_UMAP")
# Get just the co-expression plot, built-in legend is meaningless for this plot
#plots[[3]] + NoLegend()
# Get just the key
#plots[[4]]
# Stitch the co-expression and key plots together
plots[[3]] + NoLegend() + plots[[4]]/plot_spacer() + plot_layout(widths = c(2,1))
=======
>>>>>>> directory_change
marker genes plots
<<<<<<< HEAD## make plots
plots <- FeaturePlot(tenx.mutant.integrated, features = c("PBANKA-1319500", "PBANKA-0416100"), blend = TRUE, combine = FALSE, coord.fixed = TRUE, dims = c(2,1), reduction = "DIM_UMAP")
# Get just the co-expression plot, built-in legend is meaningless for this plot
#plots[[3]] + NoLegend()
# Get just the key
#plots[[4]]
# Stitch the co-expression and key plots together
plots[[3]] + NoLegend() + plots[[4]]/plot_spacer() + plot_layout(widths = c(2,1))
Then define each cluster as Male, Female or Asexual:
## find a good ring marker, to see if there is a better one than the ones reported
#markers_ring <- FindMarkers(tenx.mutant.integrated, ident.1 = c("4", "5", "16", "11", "7", "3", "9", "0", "22"))
#head(markers_ring)
# PBANKA-1319500 - CCP2 - female - used in 820 line
# PBANKA-0416100 - MG1 - dynenin heavy chain - male - used in 820 line
# PBANKA-1437500 - AP2G - commitment
# PBANKA-0831000 - MSP1 - late asexual
# PBANKA-1102200 - MSP8 - early asexual (from Bozdech paper)
# PBANKA-0711900 - HSP70 - promoter used for GFP and RFP expression in the mutants
# PBANKA-1400400 - FAMB - ring marker
marker_gene_plot_CCP2 <- FeaturePlot(tenx.mutant.integrated, features = "PBANKA-1319500", coord.fixed = TRUE, min.cutoff = "q1", dims = c(2,1), reduction = "DIM_UMAP") +
theme_void() +
labs(title = paste("CCP2 (Female)")) +
theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold"))
marker_gene_plot_MG1 <- FeaturePlot(tenx.mutant.integrated, features = "PBANKA-0416100", coord.fixed = TRUE, min.cutoff = "q1", dims = c(2,1), reduction = "DIM_UMAP") +
theme_void() +
labs(title = paste("MG1 (Male)")) +
theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold"))
marker_gene_plot_AP2G <- FeaturePlot(tenx.mutant.integrated, features = "PBANKA-1437500", coord.fixed = TRUE, min.cutoff = "q1", dims = c(2,1), reduction = "DIM_UMAP") +
theme_void() +
labs(title = paste("AP2G (Commitment)")) +
theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold"))
marker_gene_plot_MSP1 <- FeaturePlot(tenx.mutant.integrated, features = "PBANKA-0831000", coord.fixed = TRUE, min.cutoff = "q1", dims = c(2,1), reduction = "DIM_UMAP") +
theme_void() +
labs(title = paste("MSP1 (Schizont)")) +
theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold"))
marker_gene_plot_MSP8 <- FeaturePlot(tenx.mutant.integrated, features = "PBANKA-1102200", coord.fixed = TRUE, min.cutoff = "q1", dims = c(2,1), reduction = "DIM_UMAP") +
theme_void() +
labs(title = paste("MSP8 (Asexual)")) +
theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold"))
marker_gene_plot_SBP1 <- FeaturePlot(tenx.mutant.integrated, features = "PBANKA-1101300", coord.fixed = TRUE, min.cutoff = "q1", dims = c(2,1), reduction = "DIM_UMAP") +
theme_void() +
labs(title = paste("SBP1 (Ring)")) +
theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold"))
marker_gene_plot_FAMB <- FeaturePlot(tenx.mutant.integrated, features = "PBANKA-1400400", coord.fixed = TRUE, min.cutoff = "q1", dims = c(2,1), reduction = "DIM_UMAP") +
theme_void() +
labs(title = paste("FAMB (Ring)")) +
theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold"))
marker_gene_plot_HSP70 <- FeaturePlot(tenx.mutant.integrated, features = "PBANKA-0711900", coord.fixed = TRUE, min.cutoff = "q1", dims = c(2,1), reduction = "DIM_UMAP") +
theme_void() +
labs(title = paste("(HSP70; Reporter)","\n", "PBANKA_0711900")) +
theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold"))
##original label:
# labs(title = paste("(CCP2; Female)","\n", "PBANKA_1319500"))
## plot
marker_gene_plot_FAMB + marker_gene_plot_MSP8 + marker_gene_plot_MSP1 + marker_gene_plot_AP2G + marker_gene_plot_CCP2 + marker_gene_plot_MG1 + marker_gene_plot_HSP70
Then define each cluster as Male, Female or Asexual:
>>>>>>> directory_change## make a custom pal
# 1 = blue - "#0052c5"
# 2 = red - "#a52b1e"
# 3 = green - "#016c00"
# 4 = yellow - "#ffe400"
pal_sex <- c("#0052c5","#ffe400", "#a52b1e", "#016c00")
DimPlot(tenx.mutant.integrated, label = FALSE, repel = TRUE, pt.size = 0.1, group.by = "cluster_colours_figure", dims = c(2,1), reduction = "DIM_UMAP") +
coord_fixed() +
scale_colour_manual(values=pal_sex) +
theme_void() +
theme(legend.position = "none")
## make a custom pal
# 1 = blue - "#0052c5"
# 2 = red - "#a52b1e"
# 3 = green - "#016c00"
# 4 = yellow - "#ffe400"
pal_sex <- c("#0052c5","#ffe400", "#a52b1e", "#016c00")
DimPlot(tenx.mutant.integrated, label = FALSE, repel = TRUE, pt.size = 0.1, group.by = "cluster_colours_figure", dims = c(2,1), reduction = "DIM_UMAP") +
coord_fixed() +
scale_colour_manual(values=pal_sex) +
theme_void() +
theme(legend.position = "none")
The original method of plotting by experiment does not allow much customisation of the plots. I.e. we cannot easily change the titles of each plot
But, we can use the following code to do this
Make final plots:
<<<<<<< HEADp1 <- plot.list$`10X_WT` +
coord_fixed() +
theme_void() +
scale_color_manual(values=c(replicate(45, "#999999"))) +
labs(title = paste("10X (wild-type)")) +
theme(legend.position = "none", plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold"))
p2 <- plot.list$WT +
coord_fixed() +
theme_void() +
scale_color_manual(values=c(replicate(46, "#999999"))) +
labs(title = paste("Smart-seq2 (wild-type)")) +
theme(legend.position = "none", plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold"))
p3 <- plot.list$Mutant +
coord_fixed() +
theme_void() +
scale_color_manual(values=c(replicate(46, "#999999"))) +
labs(title = paste("Smart-seq2 (mutant)")) +
theme(legend.position = "none", plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold"))
p1 + p2 + p3
save
<<<<<<< HEADggsave("/Users/Andy/GCSKO/GCSKO_analysis_git/images_to_export/umap_paper_figure.png", plot = UMAP_composite, device = "png", path = NULL, scale = 1, width = 30, height = 30, units = "cm", dpi = 300, limitsize = TRUE)
=======
>>>>>>> directory_change
Specific gene expression of mutants
<<<<<<< HEADggsave("/Users/Andy/GCSKO/GCSKO_analysis_git/images_to_export/umap_paper_figure.png", plot = UMAP_composite, device = "png", path = NULL, scale = 1, width = 30, height = 30, units = "cm", dpi = 300, limitsize = TRUE)
=======
>>>>>>> directory_change
save
<<<<<<< HEADggsave("/Users/Andy/GCSKO/GCSKO_analysis_git/images_to_export/umap_mutants_paper_figure.png", plot = mutant_expression_composite, device = "png", path = NULL, scale = 1, width = 30, height = 30, units = "cm", dpi = 300, limitsize = TRUE)
=======
>>>>>>> directory_change
All the mutant genotypes profiled were:
<<<<<<< HEADggsave("/Users/Andy/GCSKO/GCSKO_analysis_git/images_to_export/umap_mutants_paper_figure.png", plot = mutant_expression_composite, device = "png", path = NULL, scale = 1, width = 30, height = 30, units = "cm", dpi = 300, limitsize = TRUE)
=======
>>>>>>> directory_change
<<<<<<< HEAD
## make a list of possible genotypes
unique(tenx.mutant.integrated@meta.data$identity_updated)
[1] NA "GCSKO-oom" "WT" "GCSKO-29" "GCSKO-21"
[6] "GCSKO-28" "GCSKO-17" "GCSKO-2" "GCSKO-19" "GCSKO-20"
[11] "GCSKO-13" "GCSKO-10_820" "GCSKO-3"
=======
>>>>>>> directory_change
We will use the following marker genes:
plot expression of these marker genes in each cluster
<<<<<<< HEAD## copy the clusters so you don't permanently edit the master
tenx.mutant.integrated@meta.data$seurat_clusters_plotting <- tenx.mutant.integrated@meta.data$seurat_clusters
## reorder the levels so you can plot the cluters as you wish
my_levels <- c(asex_clusters, bipotential_clusters, male_clusters, female_clusters)
## reorder the levels
tenx.mutant.integrated@meta.data$seurat_clusters_plotting <- factor(x = tenx.mutant.integrated@meta.data$seurat_clusters_plotting, levels = my_levels)
## plot
dot_plot_markers <- DotPlot(tenx.mutant.integrated, features = c("PBANKA-1319500", "PBANKA-0416100", "PBANKA-1437500", "PBANKA-0831000", "PBANKA-1102200"), group.by = "seurat_clusters_plotting") +
theme_classic() +
# change appearance and remove axis elements, and make room for arrows
theme(axis.text.x = element_text(size=16, angle = 45, hjust=1,vjust=1, family = "Arial"), text=element_text(size=16, family="Arial"), legend.position = "bottom", legend.direction = "horizontal", legend.box = "vertical", plot.title = element_blank(), plot.margin = unit(c(1,3,1,3), "lines")) +
#change the colours
scale_colour_viridis(option = "inferno", guide = "colourbar", na.value="white", begin = 0, end = 1, direction = 1) +
## change x axis label
labs(x = "Marker Genes", y = "Cluster", title = "Expression of Marker Genes by Cluster") +
## add arrows
#annotate("segment", x = 5.5, xend = 5.5, y = 21.5, yend = 25, colour = "green", size=1, alpha=1, arrow=arrow(length=unit(0.30,"cm"), type = "closed")) +
#annotate("segment", x = 5.5, xend = 5.5, y = 16.5, yend = 21.5, colour = "red", size=1, alpha=1, arrow=arrow(length=unit(0.30,"cm"), type = "closed")) +
#annotate("segment", x = 5.5, xend = 5.5, y = 0, yend = 15.5, colour = "grey", size=1, alpha=1, arrow=arrow(length=unit(0.30,"cm"), type = "closed")) +
## annotate males
geom_hline(aes(yintercept = 28.5)) +
## annotate females
geom_hline(aes(yintercept = 24.5)) +
## annotate hermaphrodite
geom_hline(aes(yintercept = 23.5)) +
## change label on bottom of plot so we can indicate markers
scale_x_discrete(labels = c(paste("PBANKA-1102200","\n", "(MSP8; early asexual)"), paste("PBANKA-0831000","\n", "(MSP1; late asexual)"), paste("PBANKA-1437500", "\n", "(AP2G; sexual commitment)"), paste("PBANKA-0416100", "\n", "(MG1; male)"), paste("PBANKA-1319500", "\n", "(CCP2; female)")))
## view
print(dot_plot_markers)
gene identities for the mutants profiled
## copy the clusters so you don't permanently edit the master
tenx.mutant.integrated@meta.data$seurat_clusters_plotting <- tenx.mutant.integrated@meta.data$seurat_clusters
## reorder the levels so you can plot the cluters as you wish
my_levels <- c(asex_clusters, bipotential_clusters, male_clusters, female_clusters)
## reorder the levels
tenx.mutant.integrated@meta.data$seurat_clusters_plotting <- factor(x = tenx.mutant.integrated@meta.data$seurat_clusters_plotting, levels = my_levels)
## plot
dot_plot_markers <- DotPlot(tenx.mutant.integrated, features = c("PBANKA-1319500", "PBANKA-0416100", "PBANKA-1437500", "PBANKA-0831000", "PBANKA-1102200"), group.by = "seurat_clusters_plotting") +
theme_classic() +
# change appearance and remove axis elements, and make room for arrows
theme(axis.text.x = element_text(size=16, angle = 45, hjust=1,vjust=1, family = "Arial"), text=element_text(size=16, family="Arial"), legend.position = "bottom", legend.direction = "horizontal", legend.box = "vertical", plot.title = element_blank(), plot.margin = unit(c(1,3,1,3), "lines")) +
#change the colours
scale_colour_viridis(option = "inferno", guide = "colourbar", na.value="white", begin = 0, end = 1, direction = 1) +
## change x axis label
labs(x = "Marker Genes", y = "Cluster", title = "Expression of Marker Genes by Cluster") +
## add arrows
#annotate("segment", x = 5.5, xend = 5.5, y = 21.5, yend = 25, colour = "green", size=1, alpha=1, arrow=arrow(length=unit(0.30,"cm"), type = "closed")) +
#annotate("segment", x = 5.5, xend = 5.5, y = 16.5, yend = 21.5, colour = "red", size=1, alpha=1, arrow=arrow(length=unit(0.30,"cm"), type = "closed")) +
#annotate("segment", x = 5.5, xend = 5.5, y = 0, yend = 15.5, colour = "grey", size=1, alpha=1, arrow=arrow(length=unit(0.30,"cm"), type = "closed")) +
## annotate males
geom_hline(aes(yintercept = 28.5)) +
## annotate females
geom_hline(aes(yintercept = 24.5)) +
## annotate hermaphrodite
geom_hline(aes(yintercept = 23.5)) +
## change label on bottom of plot so we can indicate markers
scale_x_discrete(labels = c(paste("PBANKA-1102200","\n", "(MSP8; early asexual)"), paste("PBANKA-0831000","\n", "(MSP1; late asexual)"), paste("PBANKA-1437500", "\n", "(AP2G; sexual commitment)"), paste("PBANKA-0416100", "\n", "(MG1; male)"), paste("PBANKA-1319500", "\n", "(CCP2; female)")))
Scale for 'colour' is already present. Adding another scale for 'colour', which
will replace the existing scale.
## view
print(dot_plot_markers)
plot expression of these mutant genes by cluster
<<<<<<< HEAD## plot
dot_plot_mutant_genes <- DotPlot(tenx.mutant.integrated, features = c("PBANKA-0828000", "PBANKA-1302700", "PBANKA-1447900", "PBANKA-0102400", "PBANKA-0716500", "PBANKA-1435200", "PBANKA-1418100", "PBANKA-1144800", "PBANKA-0902300", "PBANKA-0413400", "PBANKA-1454800"), group.by = "seurat_clusters_plotting") +
theme_classic() +
## change appearance and remove axis elements, and make room for arrows, and also change posoition of legends relative to one another
theme(axis.text.x = element_text(size=12, angle = 45, hjust=1,vjust=1), legend.position = "bottom", legend.direction = "horizontal", legend.box = "vertical", plot.margin = unit(c(1,3,1,3), "lines"), text=element_text(size=16, family="Arial")) +
##add these to above to remove y = plot.title = element_blank(), axis.text.y = element_blank(), axis.ticks.y = element_blank(), axis.title.y = element_blank()
## change the colours
scale_colour_viridis(option = "inferno", guide = "colourbar", na.value="white", begin = 0, end = 1, direction = 1) +
## change x axis label
labs(x = "Mutant Genes", title = "Expression of mutant genes by cluster", y = "Cluster") +
## annotate males
geom_hline(aes(yintercept = 28.5)) +
## annotate females
geom_hline(aes(yintercept = 24.5)) +
## annotate hermaphrodite
geom_hline(aes(yintercept = 23.5)) +
## change label on bottom of plot so we can indicate markers
scale_x_discrete(labels = c(paste("PBANKA_1454800","\n", "(GCSKO 21)"),
paste("PBANKA-0413400","\n", "(GCSKO 10)"),
paste("PBANKA-0902300", "\n", "(GCSKO 13)"),
paste("PBANKA-1144800", "\n", "(GCSKO 28)"),
paste("PBANKA-1418100", "\n", "(GCSKO 17)"),
paste("PBANKA-1435200", "\n", "(GCSKO 20)"),
paste("PBANKA-0716500", "\n", "(GCSKO 19)"),
paste("PBANKA-0102400", "\n", "(GCSKO 2)"),
paste("PBANKA-1447900", "\n", "(GCSKO 29)"),
paste("PBANKA-1302700", "\n", "(GCSKO oom)"),
paste("PBANKA-0828000", "\n", "(GCSKO 3)")))
## view
print(dot_plot_mutant_genes)
=======
>>>>>>> directory_change
make a metadata column where the 10X data is classified as a WT genotype
Plot expression of mutant genes by cluster (which is subdivided by genotype)
This is kind of a control because the mutant should express less of the gene of interest at some point due to the inclusion of the mutant cells
Add a meta.data column so that 10X is listed as WT:
prepare data for dotplotting
plot
<<<<<<< HEAD =======plot
>>>>>>> directory_changemaybe the respresentation differences have batch-effects:
dot_plot_identity + dot_plot_markers + dot_plot_mutant_genes
=======
>>>>>>> directory_change
<<<<<<< HEAD
dot_plot_identity + dot_plot_markers + dot_plot_mutant_genes
save
19 vs 8 on resolution 2 already looks pretty cool:
look at them across the dataset
19 –> 13 female - 14,15,17,9, male - 25,20,21,7
Make a subsetted Seurat object of sexual cells.
Include the pre-branch too as well as any weird clusters that may have clustered out.
## define cells
## 2 and 0 are at the beginning of the stalk
sex_clusters <- c(bipotential_clusters, female_clusters, male_clusters, "2", "0")
## subset cells into new object
tenx.mutant.integrated.sex <- subset(tenx.mutant.integrated, idents = sex_clusters)
=======
>>>>>>> directory_change
## inspect object
tenx.mutant.integrated.sex
## look at original UMAP
DimPlot(tenx.mutant.integrated.sex, label = TRUE, repel = TRUE, pt.size = 0.1, split.by = "experiment", dims = c(2,1), reduction = "DIM_UMAP") + coord_fixed()
=======
>>>>>>> directory_change
we want to remove:
<<<<<<< HEAD## extract cell embeddings
df_sex_cell_embeddings <- as.data.frame(tenx.mutant.integrated.sex@reductions[["DIM_UMAP"]]@cell.embeddings)
## subset anything lower than -0.75 in UMAP 2 and -7 in UMAP 1
remove_cells <- row.names(df_sex_cell_embeddings[which(df_sex_cell_embeddings$DIMUMAP_2 < 1.9 & df_sex_cell_embeddings$DIMUMAP_1 > 0.15), ])
## plot these cells
DimPlot(tenx.mutant.integrated.sex, label = FALSE, repel = TRUE, pt.size = 0.1, cells.highlight = remove_cells, dims = c(2,1), reduction = "DIM_UMAP") +
coord_fixed() +
scale_color_manual(values=c("#000000", "#f54e1e")) +
theme_void() +
#labs(title = paste("(Mutant oom)","\n", "PBANKA_1302700")) +
theme(plot.title = element_text(hjust = 0.5), legend.position = "none")
=======
>>>>>>> directory_change
## make keep cells from the remove_cells
## make the not in function
'%ni%' <- Negate('%in%')
keep_cells <- colnames(tenx.mutant.integrated.sex)[which(colnames(tenx.mutant.integrated.sex) %ni% remove_cells)]
## subset
tenx.mutant.integrated.sex <- subset(tenx.mutant.integrated.sex, cells = keep_cells)
## inspect
tenx.mutant.integrated.sex
=======
>>>>>>> directory_change
copy old clusters over
Save environment
Save object(s)
a new object will be the MCA dataset:
(file:///Users/ar19/Desktop/PhD/MCA_Publication/MCA_10X_SS2_Data.nb.html) ss2_molecules <- read.csv(“/Users/ar19/Desktop/mca.qc.tmm_norm_counts_20180625.csv”, header = TRUE, row.names=1, stringsAsFactors = TRUE) ss2_anno <- read.csv(“/Users/ar19/Desktop/QC_pheno_20180625.csv”, header = TRUE, stringsAsFactors = TRUE, row.names = 1) (file:///Users/ar19/Desktop/PhD/MCA_Publication/MCA%20-%20Genes%20Captured%20Analysis.nb.html)
molecules <- read.table(“/Volumes/team222/MCA/Countfiles/allcounts3.csv”, header = TRUE, sep = “,”, row.names=1, stringsAsFactors = TRUE) anno <- read.delim(“/Volumes/team222/MCA/Countfiles/allpheno5.csv”, header = TRUE, sep = “,”)
– Subset only 10X cells
– cluster 24 is predet cells – cluster 29 is post cells – cluster 36 is post cells
Not very customisable, so make a ggplot2 workaround:
option 2:
plot together
pathwork way:
solved the monocle issue by following:
here: https://github.com/r-spatial/sf
then devtools::install_github(‘cole-trapnell-lab/monocle3’)
optimise UMAP
optimising UMAP 2
feature plot with viridis:
Export just smart-seq2 and just 10x
## extract only cells in cluster 14:
seurat.object.subset <- SubsetData(tenx.mutant.integrated, subset.name = "seurat_clusters", accept.value = c("24"))
##get their names:
names_of_cells_in_cluster_24 <- colnames(seurat.object.subset@assays$RNA@counts)
## Subset 10X Dataset
tenx_cluster_24 <- SubsetData(pb_sex_filtered, cells = names_of_cells_in_cluster_24)
tenx_cluster_24_matrix <- as(as.matrix(GetAssayData(tenx_cluster_24, assay = "RNA", slot = "data")), 'sparseMatrix')
#pd <- data.frame(seurat.object.subset@meta.data)
## Subset SS2 Dataset
ss2_cluster_24 <- SubsetData(ss2_mutants_final, cells = names_of_cells_in_cluster_24)
ss2_cluster_24_matrix <- as(as.matrix(GetAssayData(ss2_cluster_24, assay = "RNA", slot = "data")), 'sparseMatrix')
## write to file
write.csv(ss2_cluster_24_matrix, file = "~/data_to_export/ss2_cluster_24_matrix.csv")
write.csv(tenx_cluster_24_matrix, file = "~/data_to_export/tenx_cluster_24_matrix.csv")